A scalable software-architecture for high-speed color Document compression based on JPEG2000 (Part 6)
Internal identifier: 001A00 (Main/Exploration); previous: 001999; next: 001A01
Authors: Michael Thierschmann [Germany]; Uwe-Erik Martin [Germany]
Source:
- SPIE proceedings series [1017-2653]; 2002.
French descriptors
- Pascal (Inist)
- Reconnaissance caractère, Reconnaissance forme, Reconnaissance optique caractère, Texte, Traitement document, Performance système, Système gestion base donnée, Banque image, Compression image, Résolution image, Architecture logiciel, Gestion système, Balayage, Gestion document, Lisibilité, JPEG2000.
English descriptors
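- KwdEn: Character recognition, Database management system, Document management, Document processing, Image compression, Image databank, Image resolution, Legibility, Optical character recognition, Pattern recognition, Scanning, Software architecture, System management, System performance, Text.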
Abstract
Modern document scanning systems make it possible to process color documents with Document Management Systems (DMS). Because scanning a typical A4 document at 300 dpi generates an enormous amount of image data, image compression is used. The JPEG compression scheme is widely used for such image data. The loss of image quality caused by the necessary lossy compression can significantly reduce the recognition quality of a subsequent optical character recognition (OCR) process, which is essential to any DMS. The new standard JPEG2000 (Part 6), a high-performance system for compressing and archiving scanned documents, particularly those containing both text and images, bridges the gap between high compression and the legibility required of documents managed inside a DMS. Using JPEG2000 (Part 6) yields substantially higher image quality than standard compression techniques. This quality is achieved by combining automatic text detection with bitonal compression of text and color/grayscale wavelet compression of images. Because the JPEG2000 (Part 6) compression scheme is a complex, computationally demanding image processing system, a scalable software system has been designed to match the throughput of high-performance document scanners.
Affiliations:
Links to previous steps (curation, corpus...)
- to stream PascalFrancis, to step Corpus: 000647
- to stream PascalFrancis, to step Curation: 000145
- to stream PascalFrancis, to step Checkpoint: 000638
- to stream Main, to step Merge: 001A91
- to stream Main, to step Curation: 001A00
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">A scalable software-architecture for high-speed color Document compression based on JPEG2000 (Part 6)</title>
<author><name sortKey="Thierschmann, Michael" sort="Thierschmann, Michael" uniqKey="Thierschmann M" first="Michael" last="Thierschmann">Michael Thierschmann</name>
<affiliation wicri:level="3"><inist:fA14 i1="01"><s1>LuraTech GmbH, Rotherstrasse 20</s1>
<s2>10245 Berlin</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName><region type="land" nuts="3">Berlin</region>
<settlement type="city">Berlin</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Martin, Uwe Erik" sort="Martin, Uwe Erik" uniqKey="Martin U" first="Uwe-Erik" last="Martin">Uwe-Erik Martin</name>
<affiliation wicri:level="3"><inist:fA14 i1="01"><s1>LuraTech GmbH, Rotherstrasse 20</s1>
<s2>10245 Berlin</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName><region type="land" nuts="3">Berlin</region>
<settlement type="city">Berlin</settlement>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">02-0512026</idno>
<date when="2002">2002</date>
<idno type="stanalyst">PASCAL 02-0512026 INIST</idno>
<idno type="RBID">Pascal:02-0512026</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000647</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000145</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000638</idno>
<idno type="wicri:doubleKey">1017-2653:2002:Thierschmann M:a:scalable:software</idno>
<idno type="wicri:Area/Main/Merge">001A91</idno>
<idno type="wicri:Area/Main/Curation">001A00</idno>
<idno type="wicri:Area/Main/Exploration">001A00</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">A scalable software-architecture for high-speed color Document compression based on JPEG2000 (Part 6)</title>
<author><name sortKey="Thierschmann, Michael" sort="Thierschmann, Michael" uniqKey="Thierschmann M" first="Michael" last="Thierschmann">Michael Thierschmann</name>
<affiliation wicri:level="3"><inist:fA14 i1="01"><s1>LuraTech GmbH, Rotherstrasse 20</s1>
<s2>10245 Berlin</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName><region type="land" nuts="3">Berlin</region>
<settlement type="city">Berlin</settlement>
</placeName>
</affiliation>
</author>
<author><name sortKey="Martin, Uwe Erik" sort="Martin, Uwe Erik" uniqKey="Martin U" first="Uwe-Erik" last="Martin">Uwe-Erik Martin</name>
<affiliation wicri:level="3"><inist:fA14 i1="01"><s1>LuraTech GmbH, Rotherstrasse 20</s1>
<s2>10245 Berlin</s2>
<s3>DEU</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Allemagne</country>
<placeName><region type="land" nuts="3">Berlin</region>
<settlement type="city">Berlin</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
<imprint><date when="2002">2002</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">SPIE proceedings series</title>
<idno type="ISSN">1017-2653</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Character recognition</term>
<term>Database management system</term>
<term>Document management</term>
<term>Document processing</term>
<term>Image compression</term>
<term>Image databank</term>
<term>Image resolution</term>
<term>Legibility</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Scanning</term>
<term>Software architecture</term>
<term>System management</term>
<term>System performance</term>
<term>Text</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Reconnaissance caractère</term>
<term>Reconnaissance forme</term>
<term>Reconnaissance optique caractère</term>
<term>Texte</term>
<term>Traitement document</term>
<term>Performance système</term>
<term>Système gestion base donnée</term>
<term>Banque image</term>
<term>Compression image</term>
<term>Résolution image</term>
<term>Architecture logiciel</term>
<term>Gestion système</term>
<term>Balayage</term>
<term>Gestion document</term>
<term>Lisibilité</term>
<term>JPEG2000</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">The processing of colored documents with Document Management Systems (DMS) is possible with the modern document scanning systems today. Because of the enormous amount of image data generated scanning a typical A4 document with a 300 dpi resolution, image compression is used. The JPEG compression scheme is widely used for such image data. The lack of image quality caused by necessary lossy compression, can significantly reduce the recognition quality of a subsequent optical character recognition (OCR) process, which is essential to any DMS system. The new standard JPEG2000 (Part 6), a high performance system for compressing and archiving scanned documents, particularly those containing text and image, is overcoming the gap between high compression and legibility of documents suitable to be managed inside DMS systems. The utilization of JPEG2000 (Part 6) results in substantially higher image quality in comparison to standard compression techniques. This high quality is achieved by combining automatic text detection with bitonal compression of text and color/grayscale wavelet compression of images. Since the innovative JPEG2000 (Part 6) compression scheme is a complex image processing system, allocating some computational performance, a scalable software system has been designed to meet the throughput of high-performance document scanners.</div>
</front>
</TEI>
<affiliations><list><country><li>Allemagne</li>
</country>
<region><li>Berlin</li>
</region>
<settlement><li>Berlin</li>
</settlement>
</list>
<tree><country name="Allemagne"><region name="Berlin"><name sortKey="Thierschmann, Michael" sort="Thierschmann, Michael" uniqKey="Thierschmann M" first="Michael" last="Thierschmann">Michael Thierschmann</name>
</region>
<name sortKey="Martin, Uwe Erik" sort="Martin, Uwe Erik" uniqKey="Martin U" first="Uwe-Erik" last="Martin">Uwe-Erik Martin</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001A00 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001A00 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= Pascal:02-0512026 |texte= A scalable software-architecture for high-speed color Document compression based on JPEG2000 (Part 6) }}
This area was generated with Dilib version V0.6.32.